Abstract

In this paper, we present approximation algorithms for combinatorial optimization problems
under probabilistic constraints. Specifically, we focus on stochastic variants of two important
combinatorial optimization problems: the k-center problem and the set cover problem, with
uncertainty characterized by a probability distribution over the set of points or elements to be
covered. We consider these problems under adaptive and non-adaptive settings, and present
efficient approximation algorithms for the case when the underlying distribution is a product
distribution. In contrast to the expected cost model prevalent in the stochastic optimization
literature, our problem definitions support restrictions on the probability distributions of the
total costs, by incorporating constraints that bound the probability with which the incurred
costs may exceed a given threshold.



1 Introduction 

A prevalent model for dealing with uncertain data in optimization problems is to minimize the
expected cost over an input probability distribution. However, the expected cost model does not
adequately capture the following two aspects of the problem. Firstly, in many applications,
constraint violations cannot be modeled by costs or penalties in any reasonable way (e.g.,
safety-relevant restrictions like levels of a water reservoir). Thus, if the problem constraints
involve an uncertain parameter, one would rather insist on bounding the probability that a
decision is infeasible. This leads to probabilistic or chance constraints of the type:

$$P(h(x,\xi) \le 0) \ge 1 - \rho$$

where $x$ and $\xi$ are decision and random vectors, respectively, "$h(x,\xi) \le 0$" refers to a
finite set of constraints, $P$ is a probability measure, and $\rho$ is a small input constant.

Another criticism of the expected value measure is that it fails to capture the risk associated
with the decisions: two decisions are valued equally if they have the same expected cost. However,
it can be the case that while one decision incurs moderate cost under all scenarios, the other
incurs a huge cost for a disaster scenario with non-negligible probability. A risk-averse user will
naturally prefer the former decision. Various measures have been proposed in the finance and
stochastic optimization literature to capture this notion of risk averseness. A popular measure is
the 'value-at-risk (VaR)' measure, which is widely used in financial models, and has even been
written into some industry regulations [?]. For a given risk aversion level $\rho$, value-at-risk is
given by the smallest value $\gamma$ such that the probability that the objective cost exceeds
$\gamma$ is at most $\rho$. This leads to the probabilistic constraint:

$$P(f(x,\xi) > \gamma) \le \rho$$

where $f(x,\xi)$ is the objective value for decision $x$ in scenario $\xi$.
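
To make the value-at-risk definition concrete, the following minimal sketch computes it for a finite list of scenarios. The function name `value_at_risk` and the representation of scenarios as parallel lists of costs and probabilities are our own assumptions for illustration, not part of the model above; we also assume $0 \le \rho < 1$.

```python
from typing import Sequence


def value_at_risk(costs: Sequence[float], probs: Sequence[float], rho: float) -> float:
    """Smallest gamma with P(cost > gamma) <= rho for a finite scenario list.

    Hypothetical helper: scenarios are given as parallel lists of costs and
    probabilities (probabilities assumed to sum to 1, and 0 <= rho < 1).
    """
    scenarios = sorted(zip(costs, probs))
    tail = 1.0  # probability mass of the scenarios not yet passed
    for cost, prob in scenarios:
        tail -= prob
        # tail is now an upper bound on P(cost' > cost); small tolerance for rounding
        if tail <= rho + 1e-12:
            return cost
    return scenarios[-1][0]
```

For instance, a decision costing 10 in every scenario and a decision costing 0 with probability 0.95 and 200 with probability 0.05 have the same expected cost, but at $\rho = 0.01$ their value-at-risk is 10 and 200, respectively.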

In this paper, we develop approximation algorithms for such probabilistically constrained
optimization problems. Specifically, we look at stochastic variants of two important combinatorial
optimization problems: the k-center problem and the set cover problem, with uncertainty
characterized by a probability distribution over the subset of points or elements to be covered.
We study the problems under "non-adaptive" and "adaptive" settings. In the non-adaptive setting,
the entire set cover (k-center) must be chosen before the random element set is known. The goal is
to minimize the covering cost (clustering distance) while satisfying a constraint that the
probability of covering a random subset of elements is at least a given input threshold. In the
adaptive setting, the set cover (k-center) can be chosen adaptively for each scenario after
observing the random element set. The goal is to determine the quality of the optimal adaptive
solution using the value-at-risk (VaR) measure, that is, to determine the minimum value $\gamma$
such that the probability that the covering cost (clustering distance) exceeds $\gamma$ is at most
$\rho$. Note that these two settings capture the two problem aspects mentioned in the previous
paragraphs.

Below we give formal definitions of our optimization problems and assumptions made on the 
statistical information available; followed by a summary of results and related previous work. 

Non-adaptive stochastic k-center: Consider a set $V$ of $n$ vertices. Assume that the distance
$d(u,v)$ between two vertices $u$ and $v$ in $V$ is given by a graph metric $G = (V,E)$. The
deterministic k-center problem is to find a subset $C \subseteq V$, $|C| \le k$, which minimizes
the distance $r$ such that

$$\max_{v \in V} d(v, C) \le r$$

In the stochastic k-center problem, the subset of $V$ that actually needs to be served is given by
a random variable $\tilde{V}$, where each vertex $v_i$ appears in $\tilde{V}$ independently with
probability $p_i$. The problem is to choose a set $C \subseteq V$, $|C| \le k$, which minimizes the
distance $r$ such that

$$P\Big(\max_{v \in \tilde{V}} d(v, C) \le r\Big) \ge 1 - \rho$$

for a small input constant $0 < \rho < 1$.

Adaptive stochastic k-center: In the adaptive setting, the $k$ centers will be chosen after the
random subset $\tilde{V}$ becomes known. Thus, the k-center solution $\tilde{C}$ is itself a random
variable, and depends on the random subset $\tilde{V}$. The problem is to compute the
value-at-risk, that is, the minimum distance $r$ such that

$$P\Big(\max_{v \in \tilde{V}} d(v, \tilde{C}) > r\Big) \le \rho$$

Here $\tilde{C}$ denotes an optimal k-center solution for the subset $\tilde{V}$.
Non-adaptive stochastic set cover: Given a universe of $n$ elements
$E = \{e_1, e_2, \ldots, e_n\}$ and a family $\mathcal{S}$ of $m$ subsets of $E$, the deterministic
set cover problem is to find the minimum cost subcollection $C \subseteq \mathcal{S}$ such that
every element in $E$ is covered by some set in $C$. In the stochastic set cover problem, the
elements to be covered are a random subset $\tilde{E}$ of $E$, where each element $e_j$ appears
independently in $\tilde{E}$ with probability $p_j$. The problem is to find a minimum cost
subcollection $C \subseteq \mathcal{S}$ such that the probability that every element in $\tilde{E}$
is covered by some set in $C$ is at least an input threshold $1 - \rho$.

Adaptive stochastic set cover: In the adaptive setting, the set cover will be chosen after the
random subset of elements $\tilde{E}$ becomes known. The problem is to compute the value-at-risk
$B$, that is, the minimum value $B$ such that

$$P\Big(\sum_{i \in \tilde{C}} c_i > B\Big) \le \rho$$

Here $\tilde{C}$ denotes an optimal set cover for the random subset $\tilde{E}$, and $c_i$ is the
cost of set $i$.



1.1 Summary of our results 

For the k-center problems (non-adaptive and adaptive), we present polynomial-time dynamic
programming algorithms that give optimal solutions for tree metrics. Moreover, we show that the
algorithms for tree metrics can be extended to give an efficient PTAS for planar graph metrics, and
more generally for a class of graphs called 'bounded genus' graphs. Here, the approximation is only
in the number of centers; the probabilistic constraint holds exactly. For the set cover problem, we
give an $O(\log n)$-approximation algorithm for the non-adaptive case. We also show that for the
adaptive case of this problem, verifying the probability threshold is at least as hard as the
problem of counting maximum independent sets of a graph, and hence is likely to be very hard to
approximate.

We use combinatorial optimization techniques like dynamic programming to obtain fast and
accurate algorithms for stochastic optimization problems. A common limitation of previous work
on approximation algorithms for probabilistically constrained optimization problems is that
the probabilistic constraint cannot be satisfied accurately. That is, an approximation of the type
$P(\text{cost} > (1+\epsilon_1)B) \le (1+\epsilon_2)\rho$ is obtained. We overcome this limitation
by taking advantage of the special structure of the problems in the case of product distributions,
and obtain approximation algorithms where the probabilistic constraints hold exactly.



1.2 Related Work 

There has been significant recent interest in studying stochastic optimization models from an
approximation algorithms perspective. A variety of approximation results have been obtained for
the expected value models where a compensation or recourse is available for failed scenarios (the
two-stage recourse models); see, e.g., [14] and references therein. However, results on
probabilistically constrained and risk-averse models are relatively limited. In general,
probabilistic constraints are difficult to handle, the reasons being the inherent nonconvexity of
the feasible set of a probabilistic constraint, as well as the computational difficulty of
estimating the probability term itself. However, some success has been achieved in obtaining
approximation algorithms for specific combinatorial optimization problems by taking advantage of
the structure of the problem. In this regard, closest to our work are [?] and Goel et al. [?],
which focus on stochastic load balancing problems, where item sizes are independent random
variables following Bernoulli, exponential, or Poisson distributions specified in the input. They
obtain approximation algorithms for stochastic bin packing and knapsack problems under
probabilistic constraints that limit the overflow probability of a bin or knapsack. These problems
fall into the non-adaptive framework described above. Cormode et al. [?] emphasize the application
of "uncertain k-center" and similar clustering problems in probabilistic databases, and present
various bi-criteria approximation algorithms (approximating both the number of centers and the
maximum distance to a center) for an expected cost model. Their results include an $O(\log n)$
approximation in the number of centers for our non-adaptive k-center problem, but only under an
assumption that the individual probabilities $p_i$ are polynomial. As mentioned in the previous
subsection, our work improves upon this result for tree and planar graph metrics. We give an exact
algorithm for the tree metric case, and an efficient PTAS for planar graphs, where the
approximation factor and running time do not depend on the probabilities $p_i$. The adaptive
k-center problem was not considered in the referred work.

A recent unpublished work by Swamy [13] considers two-stage risk-averse models for stochastic
set cover and related combinatorial optimization problems. In the two-stage recourse model, some
sets can be chosen in the first stage at a low cost, and then if a scenario is not covered, more
sets can be bought in the second stage as a recourse action. The risk-averse problem is to minimize
the sum of the first stage cost and the value-at-risk for the second stage. It was observed that
if the value-at-risk for the second stage is fixed to be 0, the problem reduces to
chance-constrained set cover without recourse, which is the same as our non-adaptive set cover
problem. Although the algorithms in [13] can be used under more general assumptions of "black box
distributions", we present faster algorithms that achieve better approximation factors for the
special case of product distributions. Specifically, in contrast to the results in [13], we do not
incur any approximation in the probabilistic constraint, and the running time of our algorithms is
independent of the input threshold $\rho$.

An alternate approach widely used for dealing with probabilistic constraints focuses on replacing
such constraints by more tractable constraints [?] so that any solution satisfying the new
constraints also satisfies the original probabilistic constraints with high probability. Observe
that this type of relaxation is opposite to what one aims for in the design of approximation
algorithms. Although some approximation results have been obtained [?], they apply only to problems
involving continuous random variables whose distribution satisfies a certain
concentration-of-measure property. These conditions are not fulfilled by problems with 0-1 random
vectors considered in this paper.



2 Non-adaptive stochastic k-center problem 

In this section, we look at the non-adaptive k-center problem. We present a dynamic programming
algorithm for choosing a set $C \subseteq V$ of $k$ centers that maximizes the 'success
probability' $P(\max_{v \in \tilde{V}} d(v, C) \le r)$ for a given distance $r$, where $\tilde{V}$
is the random subset of vertices to be served. The final solution can then be found by doing a
binary search for the optimal $r$ over a sorted list of the $\binom{n}{2}$ pairwise distances.
Below, we first describe an exact algorithm for tree metrics. The algorithm is similar in spirit
to the dynamic programming algorithm given in [?] for the (deterministic) k-median problem under
tree metrics. In the sequel, we extend this algorithm to obtain approximation algorithms for more
general graph metrics.
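
The outer binary search can be sketched as follows. This is only an illustration under our own assumptions: `max_success_prob` is a hypothetical oracle standing in for the dynamic program (it returns the best achievable success probability for a given $r$ and is assumed to be non-decreasing in $r$), and we add the candidate $r = 0$ to the list of pairwise distances.

```python
from itertools import combinations
from typing import Callable, Optional, Sequence


def smallest_feasible_radius(
    dist: Sequence[Sequence[float]],
    max_success_prob: Callable[[float], float],
    rho: float,
) -> Optional[float]:
    """Binary search for the smallest candidate r whose success probability is >= 1 - rho.

    `max_success_prob` is a hypothetical oracle (e.g. the dynamic program run with
    radius r); it must be non-decreasing in r for the search to be valid.
    """
    n = len(dist)
    # Candidate radii: all pairwise distances, plus 0 (an assumption of this sketch).
    candidates = sorted({0.0} | {dist[u][v] for u, v in combinations(range(n), 2)})
    lo, hi = 0, len(candidates) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if max_success_prob(candidates[mid]) >= 1.0 - rho:
            best = candidates[mid]  # feasible: try a smaller radius
            hi = mid - 1
        else:
            lo = mid + 1            # infeasible: a larger radius is needed
    return best
```

The function returns `None` when no candidate radius meets the threshold, which can only happen if the oracle never reaches $1 - \rho$.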
2.1 Exact algorithm for tree metrics

Our algorithm for tree metrics is based on a key property of our model, that is, "for any subtree
in a tree, once the number of centers in the subtree and the center closest to its root are fixed,
the probability of success for the subtree is independent of the rest of the tree". The reason
this property holds lies in the structural properties of the problem on a tree graph, and our
independence assumption on the probabilities of vertices. The hierarchical structure of the tree
ensures that the closest center to any vertex in a subtree either lies inside the subtree or is the
center closest to the root of the subtree. The independence assumption on vertices implies that
inter-dependencies between disjoint subtrees are caused only by the common centers used to cover
them. Once the closest center to the root and the number of centers in the subtrees are fixed, the
joint probability of success for a tree can be expressed as a product of success probabilities for
its subtrees. This observation gives us the optimal substructure property required for a dynamic
programming approach. We make these ideas more precise in the following.



Dynamic programming algorithm: Given a rooted tree $T = (V,E)$ with root $v_0$, $T_v$ denotes
the subtree of $T$ under vertex $v$ (including $v$), $e(v,t)$ denotes the $t$-th child edge of
vertex $v$, and $T_{e(v,t)}$ denotes the subtree of $T_v$ on the left of the edge $e(v,t)$
(including $v$ and edge $e(v,t)$). Also, $t_s$ denotes the total number of child edges of a vertex
$v_s$.

Now, for any subtree $T' = (S, E')$ of $T$, define the function $H(T', j)$ as the maximum
probability (i.e., the probability under an optimal choice of centers) that a random subset
$\tilde{S}$ of $S$ can be covered by $j$ centers. Given clustering distance $r$, we say that a set
of vertices is covered by a set of centers iff for every vertex there is some center within
distance $r$.

$$H(T', j) = \max_{C_j \subseteq S,\ |C_j| = j} \Pr(C_j \text{ covers } \tilde{S})$$
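
For a fixed set of centers, the success probability has a simple closed form under the product distribution: the centers cover the random subset exactly when every vertex farther than $r$ from all centers is absent. The sketch below uses this fact and, as a sanity check for small instances only, evaluates $H$ for the whole vertex set by exhaustive enumeration. The function names and the distance-matrix input are our own illustrative choices; this is not the dynamic program of this section.

```python
from itertools import combinations
from math import prod
from typing import Sequence


def success_probability(dist: Sequence[Sequence[float]], p: Sequence[float],
                        centers: Sequence[int], r: float) -> float:
    """Pr(every present vertex is within distance r of some center).

    Vertices appear independently with probabilities p[v], so the event holds
    exactly when all vertices that no center covers are absent.
    """
    return prod(
        1.0 - p[v]
        for v in range(len(p))
        if all(dist[v][c] > r for c in centers)  # v is not covered by any center
    )


def brute_force_H(dist: Sequence[Sequence[float]], p: Sequence[float],
                  j: int, r: float) -> float:
    """Exhaustive evaluation of H for the whole vertex set (exponential time, j <= n)."""
    n = len(p)
    return max(success_probability(dist, p, C, r) for C in combinations(range(n), j))
```

On a path with three vertices, unit edge lengths and $p_v = 0.5$, a single center at the middle vertex covers everything for $r = 1$, while for $r = 0$ the best single center leaves two vertices uncovered and succeeds with probability $0.25$.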

Note that $H(T,k)$ gives the desired optimal value. We now define the function $R(T',j,v)$, which
will prove to be an essential tool for computing values of $H(\cdot)$. Suppose it is given that
$v$ is a closest center to the root of a subtree $T'$ with vertex set $S$; then $R(T',j,v)$ is
defined as the maximum probability that a set of $j-1$ centers in $S$, along with the center $v$,
can cover the random subset $\tilde{S}$ of $S$. That is,

$$R(T', j, v) = \max_{C_{j-1} \subseteq S,\ |C_{j-1}| = j-1} \Pr(C_{j-1} \cup \{v\} \text{ covers } \tilde{S})$$

We employ a dynamic programming procedure that proceeds bottom-up in the tree and
computes all values of $R(T_{e(v_1,l)}, j, v_2)$ and $H(T_v, j)$ (and finally $H(T,k)$ for the
whole tree $T$).



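The initial values: For any leaf $v$,

$$H(T_v, j) = \begin{cases} 1 & \text{if } j \ge 1 \\ 1 - p_v & \text{otherwise.} \end{cases}$$

Also, for any vertex $v_1$, $T_{e(v_1,0)} = v_1$. So, for any pair of vertices $v_1, v_2$,

$$R(T_{e(v_1,0)}, j, v_2) = \begin{cases} 1 & \text{if } j > 1 \\ 1 & \text{if } j = 1,\ d(v_1,v_2) \le r \\ 1 - p_1 & \text{otherwise.} \end{cases}$$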
Figure 1: Tree $T_{e(v_1,l)}$ and its subtrees



Computation of $H(T_{v_1}, j)$: Let $C$ be the optimal set of $j$ centers for tree $T_{v_1}$, and
$v_r$ be the closest center to $v_1$ in $C$. Then, by the definition:

$$H(T_{v_1}, j) = R(T_{e(v_1,t_1)}, j, v_r)$$

Therefore we can compute $H(T_{v_1}, j)$ using the following relation:

$$H(T_{v_1}, j) = \max_{v_2 \in T_{v_1}} R(T_{e(v_1,t_1)}, j, v_2)$$

Computation of $R(T_{e(v_1,l)}, j, v_2)$: By definition, $v_2$ is the closest center to the root
$v_1$ of the subtree, and $l \le t_1$, the number of child edges of $v_1$. If $v_1$ is a leaf, then
$t_1 = 0$, and $R(T_{e(v_1,0)}, j, v_2)$ is given by the initial values. Assume that $v_1$ is not a
leaf and $l \ge 1$. Let $v_3$ be the vertex on the other end of edge $e(v_1,l)$ (refer to
Figure 1). The value of $R(T_{e(v_1,l)}, j, v_2)$ is given by the following recursion:

$$R(T_{e(v_1,l)}, j, v_2) = \max_{j_1, j_2 \in [0,j]} \Big\{ R(T_{e(v_1,l-1)}, j_1, v_2) \cdot R(T_{e(v_3,t_3)}, j - j_1 + 1, v_2),\ \ R(T_{e(v_1,l-1)}, j_2, v_2) \cdot H(T_{v_3}, j - j_2) \Big\}$$

The reason this equation holds is as follows. Since $v_2$ was the closest center to the root of
subtree $T_{e(v_1,l)}$, it remains the closest center to the root of subtree $T_{e(v_1,l-1)}$.
However, for subtree $T_{v_3}$ (same as $T_{e(v_3,t_3)}$), there are two possible choices: either
$v_2$ remains the closest center, or a center in $T_{v_3}$ is the closest center. The two terms on
the right represent these two choices. The product expression follows from the independence
property discussed in the beginning of this section.

We order the vertices of the tree from bottom to top and left to right. At stage $i$, we compute
the values $R(T_{e(v_1,l)}, j, v_2)$ for the $i$-th vertex $v_1$ picked in this order. For a given
vertex $v_1$, $R(T_{e(v_1,l)}, j, v_2)$ is computed for increasing values of $l$ and $j$, and all
choices of $v_2$ in $T$. Then, we compute the values $H(T_{v_1}, j)$, and go on to the next stage.
Thus, at any stage, all the terms in the above expression are already known from computations in
the previous stages.

Computing the optimal solution: Assume that we have calculated (and recorded) all values
of $H(\cdot)$ and $R(\cdot)$. $H(T,k)$ gives the optimal probability. The corresponding optimal
set of $k$ centers can be generated by carrying out another pass over this table of values. This is
a standard component of any dynamic programming procedure, so we omit the details here.
Running time complexity: For each edge $e(v_1,l)$ and each vertex $v_2$, we compute
$R(T_{e(v_1,l)}, j, v_2)$ for all $k$ values of $j$. Also, each computation of $R(\cdot)$ requires
taking a max over at most $2k$ terms. Therefore, the total complexity of computing the terms
$R(\cdot)$ is $O(n^2k^2)$. For each vertex $v$, there are at most $k$ values of $j$ for which
$H(\cdot)$ needs to be computed, and each of these computations takes $O(n)$ steps. Hence, the
total complexity of computing the terms $H(\cdot)$ is $O(n^2k)$.

Also, as a preprocessing step for the algorithm we compute the distance matrix of the tree (this
requires $O(n^2)$ steps). The algorithm needs to be repeated for $\log n^2$ possible values of
$r$. Thus, the total complexity of the procedure is $O(n^2k^2 \log n)$.

2.2 Extensions 

Extensions to more general graph metrics: In this section, we extend our algorithm to obtain
an efficient PTAS for planar graphs and a more general class of graphs called "bounded genus
graphs". The heart of this approach lies in the adaptability of the structure of $c$-outerplanar
graphs to dynamic programming. A $c$-outerplanar graph has the property that it can easily be
decomposed into two subgraphs with just $2c$ common boundary nodes. Now, a dynamic programming
algorithm similar to our algorithm for the tree case can be used. For a given $c$, let $G$ be a
$c$-outerplanar graph. Then, using techniques in [1], $G$ can be recursively decomposed into
$c$-outerplanar subgraphs $G_1$ and $G_2$ with at most $2c$ common boundary nodes. The dynamic
programming recursion is now defined as:

$$H(G, j) = \max_{\{v_i\}} R(G, j, \{v_i\})$$

$$R(G, j, \{v_i\}) = \max_{0 \le j_1, j_2 \le j} \Big\{ \max_{U \subseteq \{v_i\},\ U \ne \emptyset} R(G_1, j_1, \{v_i\}) \cdot R(G_2, j - j_1 + |U|, U),\ \ R(G_1, j_2, \{v_i\}) \cdot H(G_2, j - j_2) \Big\}$$

Here $\{v_i\}$ denotes the set of centers closest to the boundary nodes. Since there are $n$
vertices, there are at most $nk$ values of $G$ and $j$ for which $H$ has to be computed, and each
computation requires taking a max over $n^{2c}$ values. So the complexity of computing the terms
$H(\cdot)$ is $O(n^{2c+1}k)$. Similarly, there are $n^{2c+1}k$ values for which $R(\cdot)$ has to
be computed. Each computation requires taking a max over $2^{2c+1}k$ terms. Hence, for fixed $c$,
the total running time complexity of the above procedure is $O(n^{2c+1}k^2)$.

To extend this approach to general planar graphs, we can use graph decomposition concepts
from [?]. Here, we give an outline of the method. The idea is to decompose the planar graph into
disjoint $(c+1)$-outerplanar components by copying the nodes in every $c$-th 'level' [?]. Then, we
use the above algorithm for the resulting $(c+1)$-outerplanar graph. Note that we are potentially
duplicating the centers in the copied levels. However, by the pigeonhole principle, there exists
$i \in \{1, \ldots, c\}$ such that if we copy the levels $jc + i$, $j \ge 0$, then the number of
centers increases by a factor of at most $1 + 1/c$. This gives a $(1 + 1/c)$-approximation in the
number of centers, with running time polynomial in $n$ for any fixed $c$. A result by Eppstein [3]
shows that similar decompositions can be achieved in polynomial time for a more general class of
graphs called "bounded genus" graphs. Thus, our approximation algorithms extend in a natural way
to this class of graphs.
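
The pigeonhole step can be illustrated with a short sketch. We assume, purely for illustration, that the number of centers an optimal solution places on each level is available as a list `centers_per_level` (levels numbered from 1); the function name and input format are hypothetical. Since the $c$ residue classes partition the levels, the class with the fewest centers holds at most a $1/c$ fraction of them, which is the bound on the number of duplicated centers.

```python
from typing import Sequence, Tuple


def best_offset(centers_per_level: Sequence[int], c: int) -> Tuple[int, int]:
    """Return (i, duplicated) where i in {1, ..., c} minimises the number of centers
    lying on the copied levels i, c + i, 2c + i, ... (levels are numbered from 1)."""
    extra = {i: 0 for i in range(1, c + 1)}
    for level, count in enumerate(centers_per_level, start=1):
        i = (level - 1) % c + 1  # residue class of this level, mapped into 1..c
        extra[i] += count
    i_best = min(extra, key=extra.get)
    return i_best, extra[i_best]
```

For example, with per-level counts `[3, 1, 4, 1, 5, 9]` and `c = 3`, the three classes contain 4, 6 and 13 centers, so copying the levels of class 1 duplicates only 4 of the 23 centers.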

Extensions to other covering problems: Our algorithm can be directly applied to other
stochastic covering problems on planar graphs, like vertex cover, edge cover and dominating set.
The basic idea remains the same: once we fix the number of centers (covering nodes or edges) in a
subgraph and the closest center(s) to its boundary node(s), the probability of covering the
subgraph is independent of the rest of the graph. Note, however, that for problems with
non-uniform cost of centers, our dynamic programming algorithm will be pseudo-polynomial
(polynomial in the 'total cost').

3 Adaptive stochastic k-center problem 

In the adaptive setting, the goal is to find the minimum distance $r$ such that the failure
probability $P(\max_{v \in \tilde{V}} d(v, \tilde{C}) > r)$ is at most $\rho$. Again, the desired
value $r$ could be found by doing a binary search over the $\binom{n}{2}$ values of $r$, and
testing for each $r$ whether the failure probability is at most $\rho$. However, evaluating this
probability term is not straightforward. Here, a key difference from the non-adaptive setting is
that a different set of centers $\tilde{C}$ is chosen for each random scenario $\tilde{V}$,
optimized for the subset of vertices in that scenario. A brute force approach to find the failed
scenarios would require solving a deterministic k-center problem for each of the $2^n$ subsets of
$V$.
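
For very small instances, the brute force approach just described can be written down directly, which is useful as a reference when testing a faster algorithm. The sketch below is our own illustration (function names and the distance-matrix input are assumptions); it enumerates all $2^n$ scenarios and solves each deterministic k-center instance by exhaustive search, so it is exponential in $n$.

```python
from itertools import combinations, product
from typing import Sequence


def coverable(dist: Sequence[Sequence[float]], present: Sequence[int],
              k: int, r: float) -> bool:
    """True if some set of at most k centers covers all present vertices within r."""
    n = len(dist)
    if not present:
        return True
    # Using exactly min(k, n) centers loses nothing: extra centers never hurt.
    return any(
        all(any(dist[v][c] <= r for c in centers) for v in present)
        for centers in combinations(range(n), min(k, n))
    )


def adaptive_failure_probability(dist: Sequence[Sequence[float]], p: Sequence[float],
                                 k: int, r: float) -> float:
    """Probability that the realised subset cannot be covered by k centers within r."""
    n = len(p)
    failure = 0.0
    for mask in product([0, 1], repeat=n):  # all 2^n scenarios
        present = [v for v in range(n) if mask[v]]
        if not coverable(dist, present, k, r):
            prob = 1.0
            for v in range(n):
                prob *= p[v] if mask[v] else 1.0 - p[v]
            failure += prob
    return failure
```

On a path with three vertices, unit edge lengths, $p_v = 0.5$, $k = 1$ and $r = 0$, exactly the scenarios with two or more present vertices fail, giving failure probability $0.5$; a fixed single center chosen in advance would fail with probability $0.75$ on the same instance.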

In this section, we propose a dynamic programming algorithm to compute this failure probability 
in polynomial-time for a given value of r. First, we present an exact algorithm for tree metrics, 
and then extend it to more general graph metrics. 

3.1 Exact algorithm for tree metrics 

The basic idea in our algorithm is to characterize each random subset of a subtree via a profile
$(j,d,d')$ that completely captures its covering properties. Specifically, given a subtree
$T' = (S, E')$, a random subset $\tilde{S} \subseteq S$ belongs to a profile $(j,d,d')$ if and only
if

• the minimum number of centers sufficient to cover $\tilde{S}$ within distance $r$ is $j$,

• among the covers of size $j$, the minimum distance of a center to the root of $T'$ is $d$, and

• $d'$ is the maximum distance such that if a vertex $v'$ outside the subtree $T'$ and at distance
$d'$ from its root is a center, then the subtree can be covered using only $j-1$ centers. If no
such vertex $v'$ exists, then $d' = -d$.

Note that each subset of vertices belongs to exactly one profile $(j,d,d')$. This is because there
is a unique minimum number of centers $j$ required for any subset, and that corresponds to a unique
minimum distance $d$ of the closest center to the root. Also note that, using any help $v'$ from
outside the tree, at most one center can be removed out of the $j$ centers; otherwise we could
place a center at the root and reduce the minimum number of centers to $j-1$. Taking the maximum
of the distances of all such vertices $v'$ from the root, we get our unique $d'$.

The above argument shows that the profiles $(j,d,d')$ define a disjoint partition over the subsets
of any subtree $T'$. Now, define the function $DP(T',j,d,d')$ as the probability of random subsets
in $T'$ under profile $(j,d,d')$. Then, by definition, the probability of failure is given by:

$$\text{Failure probability} = \sum_{k < j \le n,\ d} DP(T, j, d, -d) \qquad (1)$$

Here, $d$ can take at most $n$ possible values, corresponding to the possible distances of vertices
from the root. Now, we are ready to present our dynamic programming algorithm. We use the same
notation as in the previous section. The algorithm will compute all values
$DP(T_{e(v,l)}, j, d, d')$ in a bottom to top, left to right order, finally computing the values
$DP(T, j, d, -d)$ that appear in the above expression for the failure probability.
Initial values: For $l = 0$, $T_{e(v,0)} = v$.

$$DP(v, j, d, d') = \begin{cases} p_v & \text{if } j = 1,\ d = 0,\ d' = \max_{v' :\, d(v,v') \le r} d(v,v') \\ 1 - p_v & \text{if } j = 0,\ d = 0,\ d' = 0 \\ 0 & \text{otherwise.} \end{cases}$$

Computation of $DP(T_{e(v_1,l)}, j, d, d')$: Now, assume that $v_1$ is not a leaf, and $l \ge 1$.
To compute $DP(T_{e(v_1,l)}, j, d, d')$ for some $l$, we reduce it to an expression consisting of
the function $DP(\cdot)$ on subtrees $T_1 = T_{e(v_1,l-1)}$ and $T_2 = T_{e(v_2,t_2)}$, where $v_2$
is the vertex on the other end of edge $e(v_1,l)$. We use the observation that a random subset
$\tilde{V}$ of this tree has a profile $(j,d,d')$ if and only if $\tilde{V}_1 = T_1 \cap \tilde{V}$
and $\tilde{V}_2 = T_2 \cap \tilde{V}$ have profiles $(j_1,d_1,d_1')$ and $(j_2,d_2,d_2')$,
respectively, satisfying one of the following conditions:

• $j_1 + j_2 = j$: In this case, we must ensure that the centers in $\tilde{V}_1$ do not help
$\tilde{V}_2$ and vice versa, so that the total minimum number of centers is $j$. Let $w$ denote
the distance $d(v_1,v_2)$; then we require $d_2 + w > d_1'$ and $d_1 + w > d_2'$. To get $d$, the
least of $d_1$ and $d_2 + w$ must be equal to $d$, and to get $d'$, the max of $d_1'$ and
$d_2' - w$ must be equal to $d'$.

• $j_1 + j_2 = j + 1$: In this case, we must ensure that the centers in $\tilde{V}_1$ help
$\tilde{V}_2$ or vice versa, so that the total minimum number of centers is $j$, that is,
$d_2 + w \le d_1'$, $d_1 + w > d_2'$ or $d_2 + w > d_1'$, $d_1 + w \le d_2'$. To get $d$, the least
of $d_1$ and $d_2 + w$ must be equal to $d$. To get $d'$, $d_1'$ must be equal to $d'$ if
$\tilde{V}_2$ is helped by $\tilde{V}_1$, and $d_2' - w$ must be equal to $d'$ if $\tilde{V}_1$ is
helped by $\tilde{V}_2$.

• $j_1 + j_2 = j + 2$: In this case, we must ensure that the centers in $\tilde{V}_1$ help
$\tilde{V}_2$ and vice versa, so that the total minimum number of centers is $j$, that is,
$d_2 \le d_1' - w$ and $d_1 \le d_2' - w$. To get $d$, the least of $d_1$ and $d_2 + w$ must be
equal to $d$. Only negative values of $d'$ ($= -d$) have this case.

It is easy to see that in each of the above cases, the conditions on $d_1, d_2$ and $d_1', d_2'$
are necessary and sufficient to get the joint profile $(j,d,d')$. Let $\mathcal{P}$ denote the
collection of profile pairs $\{(j_1,d_1,d_1'), (j_2,d_2,d_2')\}$ satisfying one of the above
conditions. Then, using the fact that the profiles are disjoint, and the independence assumptions
on the probability model, $DP(T_{e(v_1,l)}, j, d, d')$ can be expressed as

$$DP(T_{e(v_1,l)}, j, d, d') = \sum_{\mathcal{P}} DP(T_{e(v_1,l-1)}, j_1, d_1, d_1') \cdot DP(T_{e(v_2,t_2)}, j_2, d_2, d_2')$$

Observe that due to the specific order in which we compute the values of $DP(\cdot)$, all terms in
the above expression were already computed in a previous stage.

Running time complexity: For each edge, we compute at most $kn^2$ values of $DP(\cdot)$ (possible
values of $j$ and $d, d'$). For each of these terms we sum over at most $3kn^4$ terms. Therefore,
the total complexity is $O(k^2n^7)$. The preprocessing time is $O(n^2)$ for computing distance
pairs, and $O(n^2)$ for assigning initial values. Including the $\log n^2$ iterations for binary
search on $r$, the effective complexity is $O(k^2n^7 \log n)$.

3.2 Extensions 

The algorithm can be extended to more general graph classes and other covering problems on 
graphs, using ideas similar to those discussed at the end of previous section. We omit the details 
here. 
4 Non-adaptive stochastic set cover problem

We give an approximation method for the non-adaptive stochastic set cover problem by reformulating
it as a partial set cover problem. The problem (refer to Section 1), with $\tilde{E}$ denoting the
random subset of elements to be covered, can be restated as:

$$\begin{array}{ll} \min_x & \sum_{i=1}^{m} c_i x_i \\ \text{s.t.} & P(\tilde{E} \text{ is not covered by } x) \le \rho \\ & x_i \in \{0,1\} \quad \forall i \in [m] \end{array}$$

Here, $[m]$ denotes the set $\{1, \ldots, m\}$. The value of the 0-1 variable $x_i$ indicates
whether set $i$ is chosen or not.

For any element $j$, let $\mathcal{S}_j$ denote the collection of sets that cover the element $j$.
Then, the indicator function $I_j(x) = \prod_{i \in \mathcal{S}_j} (1 - x_i)$ takes value 1 if $j$
is NOT covered by solution $x$ and 0 otherwise.

Using the assumptions on our probability model:

$$P(\tilde{E} \text{ is not covered by } x) = 1 - \prod_{j=1}^{n} (1 - p_j)^{I_j(x)}$$

Let $l_j = \log \frac{1}{1 - p_j}$, and $l = \log \frac{1}{1 - \rho}$. Then, the probabilistic
constraint is equivalent to:

$$\begin{array}{rl} & 1 - \prod_{j=1}^{n} (1 - p_j)^{I_j(x)} \le \rho \\ \Leftrightarrow & \prod_{j=1}^{n} \left( \frac{1}{1 - p_j} \right)^{I_j(x)} \le \frac{1}{1 - \rho} \\ \Leftrightarrow & \sum_{j=1}^{n} I_j(x)\, l_j \le l \end{array}$$
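
The equivalence between the probabilistic constraint and the linear penalty constraint can be checked numerically. In the following minimal sketch, the representation of sets as Python `set` objects over element indices, the function names, and the requirement $p_j < 1$ are our own assumptions for illustration.

```python
from math import log, prod
from typing import Sequence, Set


def uncovered_elements(sets: Sequence[Set[int]], chosen: Sequence[int], n: int) -> list:
    """Elements j with I_j(x) = 1, i.e. not covered by any chosen set."""
    covered: Set[int] = set()
    for i in chosen:
        covered |= sets[i]
    return [j for j in range(n) if j not in covered]


def failure_probability(sets: Sequence[Set[int]], p: Sequence[float],
                        chosen: Sequence[int]) -> float:
    """P(random element subset is not covered) = 1 - prod over uncovered j of (1 - p_j)."""
    return 1.0 - prod(1.0 - p[j] for j in uncovered_elements(sets, chosen, len(p)))


def penalty(sets: Sequence[Set[int]], p: Sequence[float], chosen: Sequence[int]) -> float:
    """Sum of l_j = log(1 / (1 - p_j)) over uncovered elements (assumes p_j < 1)."""
    return sum(log(1.0 / (1.0 - p[j])) for j in uncovered_elements(sets, chosen, len(p)))


def satisfies_chance_constraint(sets: Sequence[Set[int]], p: Sequence[float],
                                chosen: Sequence[int], rho: float) -> bool:
    """Linear form of the constraint: total penalty <= l = log(1 / (1 - rho))."""
    return penalty(sets, p, chosen) <= log(1.0 / (1.0 - rho))
```

With sets $\{0,1\}$ and $\{2\}$, probabilities $(0.5, 0.5, 0.1)$ and $\rho = 0.2$, choosing only the first set leaves element 2 uncovered (failure probability $0.1$, feasible), while choosing only the second set leaves elements 0 and 1 uncovered (failure probability $0.75$, infeasible); the penalty test gives the same verdicts.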

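Therefore, we can reformulate our problem as:

$$\begin{array}{ll} \min_x & \sum_{i=1}^{m} c_i x_i \\ \text{s.t.} & \sum_{j=1}^{n} I_j(x)\, l_j \le l \\ & x_i \in \{0,1\} \quad \forall i = 1, \ldots, m \end{array}$$

which is equivalent to the following integer program:

$$\begin{array}{ll} \min & \sum_{i=1}^{m} c_i x_i \\ \text{s.t.} & \sum_{i \in \mathcal{S}_j} x_i \ge 1 - z_j \quad \forall j = 1, \ldots, n \\ & \sum_{j=1}^{n} z_j l_j \le l \\ & x_i \in \{0,1\} \quad \forall i = 1, \ldots, m \\ & z_j \in \{0,1\} \quad \forall j = 1, \ldots, n \end{array}$$

The above problem can be interpreted as a 'partial set cover problem', where the penalty for not
covering an element $j$ is given by $l_j$. The partial set cover problem is to minimize the cost of
sets ($c^T x$) such that the total penalty ($\sum_j z_j l_j$) for uncovered elements is at most a
given limit ($l$). An $O(\log n)$-approximation algorithm for the partial set cover problem appears
in [?]. The algorithm can be directly used for the above problem.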
5 Adaptive stochastic set-cover problem

In the adaptive setting, our goal is to compute the minimum value $B$ such that the probability
that the cost of an optimal set cover for a random subset of elements in $E$ exceeds $B$ is at most
$\rho$. Given a fixed value $B$, we call the subsets of $E$ with adaptive set cover cost $> B$
failed subsets, and the probability of these subsets the failure probability. We show that even
for the uniform cost edge cover case, the problem of approximating this failure probability is at
least as hard as the problem of approximately counting maximum independent sets in a graph. An
inapproximability result for the latter problem appears in [10], which states that this problem
cannot be approximated within a polynomial factor unless RP=NP (refer to Theorem 4 in [10]). Thus,
a reduction from this problem will suggest that our problem is hard to approximate as well.

Given a graph $G = (V, E)$ and a parameter $k$, we denote the edge cover failure probability by
$f(k)$. It is the probability of random subsets $\tilde{V}$ of $V$ such that the number of edges in
a minimum edge cover of $\tilde{V}$ is at least $k$ (that is, it exceeds the budget $B = k - 1$).
We call such subsets of $V$ the "failed subsets". Let each vertex appear independently in the
random subset $\tilde{V}$ with probability $p$ (that is, $p_i = p$ for all $i$). Denote by
$N_i(G,k)$ the number of failed subsets containing $i$ vertices. Then,

$$f(k) = \sum_{i=k}^{n} N_i(G,k)\, p^i (1-p)^{n-i}$$

Denote the count of maximum independent sets of graph $G$ by $I(G)$ ($\ge 1$). Let $m$ be the size
of a maximum independent set in $G$. We show that computing $f(m)$ with a good approximation
factor is at least as hard as approximating the number of maximum independent sets $I(G)$. Note
that $N_m(G,m)$ denotes the number of subsets of $V$ that have $m$ vertices and need $m$ or more
edges to cover them. The edge cover needs $m$ or more edges to cover $m$ vertices if and only if
the $m$ vertices form an independent set. Hence, $N_m(G,m) = I(G)$. Therefore,

$$f(m) = I(G)\, p^m (1-p)^{n-m} + \sum_{i=m+1}^{n} N_i(G,m)\, p^i (1-p)^{n-i} \ \ge\ I(G)\, p^m (1-p)^{n-m} \qquad (2)$$

Observe that $\sum_{i=m+1}^{n} N_i(G,m) \le 2^n \le 2^n I(G)$. Also, assume $p \le 1/2^n$. Then
$p/(1-p) \le 1$, so $p^i (1-p)^{n-i} \le p^m (1-p)^{n-m} \cdot \frac{p}{1-p}$ for every $i > m$,
and

$$\begin{array}{rcl} f(m) & = & I(G)\, p^m (1-p)^{n-m} + \sum_{i=m+1}^{n} N_i(G,m)\, p^i (1-p)^{n-i} \\ & \le & I(G)\, p^m (1-p)^{n-m} \left( 1 + 2^n \cdot \frac{p}{1-p} \right) \\ & \le & I(G)\, p^m (1-p)^{n-m} (1 + 2) \qquad (3) \end{array}$$

where the last step uses $2^n p \le 1$ and $\frac{1}{1-p} \le 2$. From inequalities (2) and (3),
we can conclude that

$$\frac{1}{3} \cdot \frac{f(m)}{p^m (1-p)^{n-m}} \ \le\ I(G) \ \le\ \frac{f(m)}{p^m (1-p)^{n-m}}$$

Thus, if we have a $(1 \pm \epsilon)$ approximation of $f(m)$, then dividing it by
$p^m (1-p)^{n-m}$ gives an approximation of $I(G)$ within a constant factor of at most
$3(1+\epsilon)/(1-\epsilon)$. This completes the reduction.
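
The quantities $m$ and $I(G)$ used in the reduction can be computed by exhaustive search on small graphs, which may help when experimenting with the bounds above. The sketch below is only such a brute-force illustration (the function name and the edge-list input are our own assumptions); it takes time exponential in the number of vertices and is not an algorithm proposed in this paper.

```python
from itertools import combinations
from typing import Iterable, Tuple


def count_maximum_independent_sets(n: int, edges: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (m, I) where m is the maximum independent set size of the graph on
    vertices 0..n-1 and I is the number of independent sets of that size."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(n, 0, -1):
        count = sum(
            1
            for subset in combinations(range(n), size)
            # independent: no pair of chosen vertices is joined by an edge
            if all(frozenset(pair) not in edge_set for pair in combinations(subset, 2))
        )
        if count:
            return size, count
    return 0, 1  # only the empty set is independent
```

For the 4-cycle this returns $m = 2$ and $I(G) = 2$ (the two pairs of opposite vertices), and for the path on three vertices it returns $m = 2$ and $I(G) = 1$.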